AMENDMENTS TO THE SPECIFICATION 
Please amend the paragraph beginning at page 1, line 18, to read as follows: 
Generally, as shown in FIG. 1A, in order to improve a processing speed of a processor 1, 
a cache 2 is interposed between the processor 1 and a main memory 3. [[In]] From various 
[[kids]] kinds of information (data) stored in the main memory 3, information to which the 
processor 1 frequently [[have]] has access is previously [[duplicated (copied) in this]] copied to 
cache 2. Further, when the processor 1 [[has access to this]] accesses cache 2 [[in place]] instead of the 
main memory 3, high-speed processing [[of]] by the processor is enabled. 

Please amend the paragraph beginning at page 2, line 1, to read as follows: 
Therefore, when the processor 1 [[has access to the]] accesses cache 2 but target information 
(data) is not stored in the cache 2, a cache miss is generated. When the cache miss occurs, the 
processor 1 reads the target information (data) from the main memory 3 and writes it in the 
cache 2. A minimum unit of information (data) transmitted/received between this main 
memory 3 and the cache 2 is referred to as a unit block. 
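
The miss handling described in this paragraph can be sketched as follows. This is an illustration only, assuming a hypothetical unit block of four words and a cache with no capacity limit or replacement policy; the names `UNIT_BLOCK_WORDS`, `SimpleCache` and `read` are assumptions made for the sketch and do not appear in the specification.

```python
UNIT_BLOCK_WORDS = 4  # assumed unit block size in words (hypothetical)


class SimpleCache:
    """Toy model of cache 2 in front of main memory 3 (no eviction)."""

    def __init__(self, main_memory):
        self.main_memory = main_memory  # list of words standing in for main memory 3
        self.blocks = {}                # block number -> words currently held in the cache
        self.misses = 0                 # number of cache misses observed so far

    def read(self, address):
        block_no, offset = divmod(address, UNIT_BLOCK_WORDS)
        if block_no not in self.blocks:
            # Cache miss: copy one whole unit block from main memory into the cache.
            self.misses += 1
            start = block_no * UNIT_BLOCK_WORDS
            self.blocks[block_no] = self.main_memory[start:start + UNIT_BLOCK_WORDS]
        return self.blocks[block_no][offset]
```

In this sketch a second read that falls in the same unit block is served from the cache without a further transfer from main memory.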

Please amend the paragraph beginning at page 2, line 9, to read as follows: 
In recent years, in order to improve a processing speed of the processor 1, a parallel 
processor which executes a plurality of types of processing in one clock cycle as typified by a 
super scalar processor has come into practical use. In the processor 1 [[executing]] performing this 
parallel processing, it is required to simultaneously access an instruction (machine instruction) 
and data (arithmetic operation data) from the cache 2, for example. In order to simultaneously 
have access to a plurality of sets of information (data) from one memory, one memory must have 
a plurality of ports (write/read terminals). 

Please amend the paragraph beginning at page 3, line 5, to read as follows: 
It is to be noted that a difference between an access pattern of the instructions and an 
access pattern of the data is as follows. One instruction [[is constituted by]] includes a plurality of 
steps which cannot be divided, and [[continuous]] contiguous addresses are accessed. Therefore, in 
the access pattern of [[the]] instructions, a required data width ([[bit]] number of bits of data read 
at a time) is large. On the contrary, it is often the case that the data is relatively randomly 
accessed, and hence a required data width (bit number of data read at a time) is small in the 
access pattern of the data. 

Please amend the paragraph beginning at page 3, line 16, to read as follows: 
However, optimal storage capacities [[optimum]] for the respective caches 4 and 5 are 
different in accordance with each program stored in the main memory 3. Therefore, comparing 
with one cache 2 in which the capacities of the respective caches 4 and 5 are added, a 
fragmentation is generated[[,]] and a utilization efficiency of the storage capacity is reduced. 
Furthermore, a cache miss ratio is increased when a program with a large working set is 
executed. 

Please amend the paragraph beginning at page 4, line 23, to read as follows: 
The detailed operations of the instruction cache 4 and the trace cache 6 will now be 
described in detail with reference to FIG. 2. 

Please amend the paragraph beginning at page 6, line 23, to read as follows: 
Since [[reduplicative]] duplicative instructions (basic blocks) exist in the instruction cache 4 
and the trace cache 6, the utilization efficiency of the entire caches is reduced. 

Please amend the paragraph beginning at page 10, line 9, to read as follows: 
Moreover, when the parallel processor specifies the instruction data in the multi-port 
bank memory by using a fetch address, a judgment is made [[upon]] based on whether it has 
access as the instruction cache or it has access as the trace cache by using, e.g., a cache hit circuit 
or the like. Based on a result of that judgment, the instruction data is read as the instruction data 
of the instruction cache, or it is read as the trace data of the trace cache. 

Please amend the paragraph beginning at page 10, line 18, to read as follows: 
Therefore, both data of the trace cache and data of the instruction cache can exist in the 
same cache, thereby realizing the effective utilization of the cache capacity. Additionally, 
[[overlapping]] duplicative storage of the same instruction data can be [[suppressed]] reduced. 

Please amend the paragraph beginning at page 12, line 13, to read as follows: 
In order to integrate the access methods, in the present invention, there are provided a bit 
used to identify whether data is data of the trace cache and two tags 1 and 2 for access. In access 
based on the instruction cache, only the tag 1 is required. On the other hand, in case of data from 
the trace cache, the tag 1 and the tag 2 are used to compare [[inferior]] lower bits of the address, 
start positions of the trace data are compared, and access is effected. 
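
One possible reading of this tag scheme is sketched below. It is not taken from the specification: the field width `LOWER_BITS`, the record `CacheLine`, and the assumption that the tag 2 holds the trace start position expressed as the lower address bits are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass

LOWER_BITS = 4  # assumed width of the lower address field (hypothetical)


@dataclass
class CacheLine:
    is_trace: bool  # identification bit: True if the line holds trace data
    tag1: int       # tag compared for both kinds of access
    tag2: int       # assumed to hold the trace start position (lower address bits)


def split_address(address):
    """Return (upper bits, lower bits) of an address under the assumed layout."""
    return address >> LOWER_BITS, address & ((1 << LOWER_BITS) - 1)


def is_hit(line, address, as_trace):
    """Decide a hit for an instruction-cache access or a trace-cache access."""
    upper, lower = split_address(address)
    if not as_trace:
        # Access based on the instruction cache: only the tag 1 is compared.
        return (not line.is_trace) and line.tag1 == upper
    # Access based on the trace cache: the tag 1 and the tag 2 must both match.
    return line.is_trace and line.tag1 == upper and line.tag2 == lower
```

Under these assumptions, a line marked as trace data never hits for an instruction-cache access, and a trace access hits only when the start position also matches.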

Please amend the paragraph beginning at page 16, line 10, to read as follows: 
According to a sixth aspect of the present invention, there is provided a multi-port 
instruction/trace/data integrated cache which is provided between a parallel processor [[executing]] 
performing a plurality of types of processing in one clock cycle and a main memory, and has a 
plurality of banks which store a part of instructions, traces and data stored in the main memory 
and a plurality of ports. 

Please amend the paragraph beginning at page 24, line 24, to read as follows: 
The bank row selection circuit 20 and the bank column selection circuit 21 convert n = 15 
addresses AD inputted from the respective first to 15th ports 11 into n row bank selection signals 
RSn, n column bank selection signals CSn, and n in-bank addresses An. The bank structure 17a 
from which [[that]] n sets of data Dn are accessed, is determined by the n row bank selection 
signals RSn and the n column bank selection signals CSn. 
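
The conversion of a port address into a row bank selection signal, a column bank selection signal and an in-bank address can be illustrated as below. The bit layout is an assumption made only for this sketch (two column bits, then two row bits, then the in-bank address, i.e. a hypothetical 4 x 4 arrangement of banks); the specification does not state this layout, and the function names are illustrative.

```python
ROW_BITS = 2     # assumed: 4 bank rows (hypothetical)
COLUMN_BITS = 2  # assumed: 4 bank columns (hypothetical)


def decode(address):
    """Split one port address AD into (row select RS, column select CS, in-bank address A)."""
    column = address & ((1 << COLUMN_BITS) - 1)
    row = (address >> COLUMN_BITS) & ((1 << ROW_BITS) - 1)
    in_bank = address >> (COLUMN_BITS + ROW_BITS)
    return row, column, in_bank


def decode_ports(addresses):
    """Decode the n port addresses into n (RS, CS, A) triples, one per port."""
    return [decode(ad) for ad in addresses]
```

For example, under this assumed layout `decode(0b10111001)` yields row 2, column 1 and in-bank address 11.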

Please amend the paragraph beginning at page 25, line 16, to read as follows: 
The port number conversion circuit 18 supplies to the bank 19 a bank selection signal S 
indicating that this bank is specified [[from]] by the n row bank selection signals RSn and the 
n column bank selection signals CSn, its own in-bank address A selected from the n in-bank 
addresses An, and its own data D selected from the n sets of data Dn. 

Please amend the paragraph beginning at page 27, line 10, to read as follows: 
By closing the switch 27 in the port number conversion circuit 18a connected to the 
bank 19a which should be selected by [[an inferior]] a lower bit in an address applied to each 
port 11, each port 11 can be connected to an arbitrary bank 19a. 

Please amend the paragraph beginning at page 29, line 12, to read as follows: 
However, one cycle is enough to eliminate a penalty due to the access contention, 
whereas several cycles to [[several-ten]] several tens of cycles are required to rewrite data in the 
cache in order to eliminate a penalty due to a cache miss. Therefore, it can be said that there is 
no problem if an access contention probability is substantially equal to a cache miss ratio. 

Please amend the paragraph beginning at page 30, line 12, to read as follows: 
For example, this integrated cache 10 has the following configuration. The instruction 
port is pre-decoded by [[inferior]] lower two bits in an address of each bank 19, and can access only 
a specific bank 19. In case of having access to continuous addresses at the same time, the access 
contention does not necessarily occur. Further, having access to addresses which are not 
continuous is possible as long as inferior two bits in each address do not conflict with others. On 
the other hand, since the data port has a lower probability of having access to continuous 
addresses than that of the instruction port, it can access all of the 16 banks 19. 
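
The contention condition for the instruction port can be sketched as follows, assuming that two simultaneous instruction accesses contend exactly when their lower two address bits are equal. This rule is a simplification adopted for illustration, and the function names are hypothetical.

```python
from collections import Counter


def lower_two_bits(address):
    """Return the lower two bits used to pre-decode the instruction port."""
    return address & 0b11


def has_contention(addresses):
    """True if two or more simultaneous accesses share the same lower two bits."""
    counts = Counter(lower_two_bits(a) for a in addresses)
    return any(n > 1 for n in counts.values())
```

Under this assumption, four continuous addresses such as 8, 9, 10 and 11 never contend, and non-continuous addresses such as 8, 13, 2 and 7 are also accepted because their lower two bits all differ, whereas 8 and 12 contend.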

Please amend the paragraph beginning at page 51, line 15, to read as follows: 
In the multi-port instruction/trace/data integrated cache 61 (which will be referred to as 
an integrated cache 61 hereinafter) according to the seventh embodiment, an output port used to 
read instruction data and trace data and an output port used to read usual data (word), which is 
not an instruction, [[are]] is provided as output [[ports]] port of a bank access circuit 44 which 
reads each data from each bank 42 of a multi-port bank memory 43. 